Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add implementation of AbstractTrees-interface #158

Merged
merged 12 commits into from
May 16, 2022

Conversation

roland-KA
Copy link
Collaborator

  • the implementation of the AbstractTree-interface is in file src/abstract_trees.jl
  • for importing and exporting the necessary identfiers some changes to src/DecisionTree.jl have been made
  • test cases are in test/miscellaneous/abstract_tree_test.jl; runtests.jl has been extended to include this test file

@ablaom
Copy link
Member

ablaom commented May 8, 2022

@roland-KA My guess about the fail is that it is caused by the existence of a Manifest.toml file in the repo, something that has previously been overlooked. (Manifests generated by different julia versions are not necessarily compatible.) This file should be removed, so that it is generated from scratch in CI every time.

You will want to add Manifest.toml to the .gitignore, so that it is not accidentally added to the git index again.

src/abstract_trees.jl Outdated Show resolved Hide resolved
Comment on lines 1 to 16
"""
Implementation of the `AbstractTrees.jl`-interface
(see: [AbstractTrees.jl](https://github.com/JuliaCollections/AbstractTrees.jl)).

The functions `children` and `printnode` make up the interface traits of `AbstractTrees.jl`
(see below for details).

The goal of this implementation is to wrap a `DecisionTree` in this abstract layer,
so that a plot recipe for visualization of the tree can be created that doesn't rely
on any implementation details of `DecisionTree`. The plot recipe is part of `MLJ.jl`
where all tree-like models may be visualized using this approach.

For a more detailed explanation of this concept have a look at the follwing article
in "Towards Data Science":
["If things are not ready to use"](https://towardsdatascience.com/part-iii-if-things-are-not-ready-to-use-59d2db378bec)
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing these are really comments to assist in a review of this PR since, for example, they refer to non-existent functionality in a downstream package 😄 . Perhaps they should be moved to the GitHub comment opening this PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I've put these comments on purpose in here, having the following audiences in mind:

  1. Somebody who looks at the source code of DecisionTree.jl has no clue what the purpose and objective of this file is. So I wanted to present the basic "why and what".
  2. We've discussed that this approach is a basic concept that could also be used for other MLJ models in order to "prepare" them for visualization. So this implementation may serve as a template for these future implementations. Therefore I wanted to give people who are planning such an implementation a bit more information.

... and yes, I anticipated a bit the future 😀🤷‍♂️; perhaps I will reformulate that section a bit. But basically I would like to keep these comments.

Maybe there will be a better place to put them in the future, but for the moment I think it's ok here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and yes, I anticipated a bit the future 😀🤷‍♂️; perhaps I will reformulate that section a bit. But basically I would like to keep these comments.

Could you do that? I mean, remove explicit reference to MLJ?

src/DecisionTree.jl Outdated Show resolved Hide resolved
Copy link
Member

@ablaom ablaom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nice PR with attention to detail and good test coverage.

@bensadeghi Apart from my minor comments, this PR looks good to me. But as this is more than just "maintenance", perhaps you would like to review it also?

Of course we need tests to pass (see earlier comment).

@ablaom
Copy link
Member

ablaom commented May 9, 2022

Looks like you still have the Manifest.toml in your branch: https://github.com/roland-KA/DecisionTree.jl/tree/abstract-tree

@codecov-commenter
Copy link

codecov-commenter commented May 10, 2022

Codecov Report

❗ No coverage uploaded for pull request base (dev@17cb46d). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head eabad6a differs from pull request most recent head d511ed3. Consider uploading reports for the commit d511ed3 to get more accurate results

@@          Coverage Diff           @@
##             dev     #158   +/-   ##
======================================
  Coverage       ?   89.51%           
======================================
  Files          ?       10           
  Lines          ?      992           
  Branches       ?        0           
======================================
  Hits           ?      888           
  Misses         ?      104           
  Partials       ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 17cb46d...d511ed3. Read the comment docs.

@roland-KA
Copy link
Collaborator Author

I forgot to remove and untracked Manifest.toml explicitly. Only adding it to .gitignore is not enough. That should be fine now.

The remaining CI error on "Julia 1.0 - windows-latest" doesn't seem to be related to our recent changes. It occurs during the iris.jl-test:

iris.jl: Test Failed at D:\a\DecisionTree.jl\DecisionTree.jl\test\classification\iris.jl:101
[474](https://github.com/JuliaAI/DecisionTree.jl/runs/6367331199?check_suite_focus=true#step:6:474)
  Expression: mean(accuracy) > 0.9
[475](https://github.com/JuliaAI/DecisionTree.jl/runs/6367331199?check_suite_focus=true#step:6:475)
   Evaluated: 0.9 > 0.9

@ablaom
Copy link
Member

ablaom commented May 10, 2022

Yes, the fail looks unrelated.

I have a suggestion regarding the output of printnode. Currently we have:

firstFt --> 0.7
├─ a (2/3)
└─ secondFt --> 0.5
   ├─ b (2/3)
   └─ c (2/3)

I find the "-->" a bit cryptic/uninformative. Can I suggest either:

firstFt <  0.7 ?
├─ a (2/3)
└─ secondFt < 0.5 ?
   ├─ b (2/3)
   └─ c (2/3)

or maybe,

firstFt: Threshold 0.7
├─ a (2/3)
└─ secondFt: Threshold 0.5
   ├─ b (2/3)
   └─ c (2/3)

with a preference for the first.

@roland-KA
Copy link
Collaborator Author

Good point! 😊

I've tried several variations during implementation, among them '<'. But as I didn't find any documentation stating clearly that '<' and not '<=' is correct, I abandoned the idea. With the new doc-string you've added to the built-in print_tree function, things are much clearer now. So I will change it accordingly. 👍

@roland-KA
Copy link
Collaborator Author

... and the CI error on "Julia 1.0 - windows-latest" didn't occur this time. May be some floating-point precision problem?

src/abstract_trees.jl Outdated Show resolved Hide resolved
src/abstract_trees.jl Outdated Show resolved Hide resolved
@ablaom
Copy link
Member

ablaom commented May 11, 2022

@roland-KA I think we're almost there. My only remaining objection is the explicit references to MLJ. This package is completely independent of MLJ (and MLJ is not the only ML framework with a DecisionTree.jl interface). So I don't think it's appropriate. Anything MLJ-specific can be added to the model doc-strings at MLJDecisionTree.jl .

The intermittently failing test is unrelated, and too minor to disrupt this PR. If it recurs in aJulia 1.6 test, we can investigate.

@roland-KA
Copy link
Collaborator Author

@roland-KA I think we're almost there. My only remaining objection is the explicit references to MLJ. This package is completely independent of MLJ (and MLJ is not the only ML framework with a DecisionTree.jl interface). So I don't think it's appropriate. Anything MLJ-specific can be added to the model doc-strings at MLJDecisionTree.jl .

Ok, I understand your issue. I've deleted the direct reference to MLJ and left only the explanation of the general idea.

@ablaom
Copy link
Member

ablaom commented May 13, 2022

@roland-KA I've belatedly noticed that @bensadeghi (the pkg author) has requested we add documentation in the README.md. I think this can be a one-liner just referencing your excellent doc-strings.

I promise this is the very last item 😳 . Thanks for your patience.

@roland-KA
Copy link
Collaborator Author

😀 ... no problem, here it is.

And yes, it would be good, if we could close the PR now, because I will be on vacation for the next two weeks 🚅 ☀️😊

@ablaom ablaom merged commit 52007b1 into JuliaAI:dev May 16, 2022
This was referenced May 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants